Published online 21 March 2011 



Nucleic Acids Research, 2011, Vol. 39, No. 13 5513-5525 

doi:10.1093/nar/gkr131



Diversity of bacterial type II toxin-antitoxin 
systems: a comprehensive search and 
functional analysis of novel families 

Raphael Leplae 1 , Damien Geeraerts 2 , Regis Hallez 2 , Julien Guglielmini 2 , Pierre Dreze 2 
and Laurence Van Melderen 2 * 

1 Laboratoire de Bioinformatique des Genomes et des Reseaux (BiGRe), Faculte des Sciences, Universite Libre 
de Bruxelles, Bld du Triomphe, 1050 Bruxelles, Belgium and 2 Laboratoire de Genetique et Physiologie
Bacterienne, Institut de Biologie et de Medecine Moleculaires, Faculte des Sciences, Universite Libre 
de Bruxelles, 12 rue des Professeurs Jeener et Brachet, 6041 Gosselies, Belgium 



Received December 23, 2010; Revised and Accepted February 22, 2011 



ABSTRACT 

Type II toxin-antitoxin (TA) systems are generally 
composed of two genes organized in an operon, 
encoding a labile antitoxin and a stable toxin. They 
were first discovered on plasmids where they con- 
tribute to plasmid stability by a phenomenon 
denoted as 'addiction', and subsequently in bacter- 
ial chromosomes. To discover novel families of 
antitoxins and toxins, we developed a bioinformat- 
ics approach based on the 'guilt by association' 
principle. Extensive experimental validation in 
Escherichia coli of predicted antitoxins and toxins 
increased significantly the number of validated 
systems and defined novel toxin and antitoxin 
families. Our data suggest that toxin families as 
well as antitoxin families originate from distinct an- 
cestors that were assembled multiple times during 
evolution. Toxin and antitoxin families found on 
plasmids tend to be promiscuous and widespread, 
indicating that TA systems move through horizontal 
gene transfer. We propose that due to their addict- 
ive properties, TA systems are likely to be main- 
tained in chromosomes even though they do not 
necessarily confer an advantage to their bacterial 
hosts. Therefore, addiction might play a major role 
in the evolutionary success of TA systems both 
on mobile genetic elements and in bacterial 
chromosomes. 



INTRODUCTION 

Toxin-antitoxin (TA) systems are small genetic modules 
found on bacterial mobile genetic elements as well as in 
bacterial chromosomes. They appear to be specific to the
eubacterial and archaebacterial worlds as no homologous
sequences are detected in eukaryotic genomes
(except for the PIN toxin domain which is present in eu- 
karyotes) (1). Toxins are always proteins but, based on the 
nature of the antitoxin and its mode of action, TA systems 
are currently divided into three classes. Antitoxins of type 
I and III systems are small RNAs that inhibit either toxin 
expression (type I) or activity (type III) (2,3). Antitoxins of 
type II systems are proteins that inactivate toxins by 
protein-protein complex formation (4). Type I and II 
systems were discovered in the mid-1980s on bacterial 
plasmids while type III was discovered only recently. 
For type I and II systems, it became rapidly clear that 
these systems were dedicated to plasmid maintenance. In 
addition to several mechanisms that prevent the production of
plasmid-free cells, plasmids have evolved subtle molecular
systems that act by killing plasmid-free daughter bacteria 
after plasmid loss. Type I and II systems are responsible 
for this phenomenon denoted as post-segregational killing 
or addiction (5,6). Addiction relies on the differential sta- 
bility of the two components, antitoxins being less stable 
than toxins. In daughter bacteria that do not inherit a 
plasmid copy, the antitoxin and toxin pool is not replen- 
ished. Since the antitoxin is labile, the toxin is released 
from inhibition, leading to plasmid-free cell death. The 
cell is thus addicted to antitoxin production and in 
extenso to the presence of the TA genes (5). Therefore,
type I and II systems contribute to an apparent plasmid 
stabilization at the population level. An extension of this 
function for type II systems is the capacity of plasmid- 
encoded systems to outcompete plasmids of the same 
incompatibility group devoid of TA systems (7). 

Bacterial genome sequencing and database mining 
revealed that homologs of type I and II systems are 
found in bacterial chromosomes; the occurrence of type 
II systems being surprisingly high (8-12). Analysis of their 
distribution provided information regarding their evolu- 
tion. While type I systems appear to have evolved by 
lineage-specific duplication (13), type II systems are
thought to move from one genome to another by horizon- 
tal gene transfer (9,14). The number of type II systems 
varies greatly not only from one bacterial species to 
another, but also between isolates from the same species 
(9,15). The function(s) of chromosomally encoded type I 
and II systems remain(s) unclear (2,16,17). Although it has 
been proposed that type II systems might serve as stress 
response modules, convincing data are still lacking (18). 
Recent studies have shown that type II systems are 
involved in the stabilization of large genomic frag- 
ments (19) and of integrative conjugative elements (20), 
indicating that some TA systems might have retained 
their addictive properties. 

Genetic organization of known type II TA systems is 
quite conserved. Typically, these systems comprise two 
small genes organized in an operon, the upstream gene 
encoding the antitoxin. General features such as small 
size of both components (31-204 amino acids for antitoxins
and 41-206 amino acids for toxins) and short intergenic
regions separating the two genes (-20 to +30 nt) are
also well conserved (9). In general, antitoxins are com- 
posed of two domains, an amino-terminal domain respon- 
sible for DNA binding and a carboxy-terminal domain 
responsible for toxin interaction. The antitoxin-toxin 
protein complex, in which the toxin is inactive, is also 
responsible for negative autoregulation. Toxins typically 
are small, stable proteins that inhibit either replication by 
interacting with DNA-gyrase, or translation by cleaving 
messenger RNAs (mRNAs) or inhibiting elongation. 

Classification of the type II TA systems is based on the 
amino acid sequence similarity of the toxins, each toxin 
family being associated with a specific antitoxin family. 
Thus, type II systems are currently divided into 10 
families (8,9). However, a few examples of 'hybrid' asso- 
ciations of a toxin from one family and an antitoxin be- 
longing to another family have recently been characterized 
(21,22). Furthermore, novel putative families of toxins and 
antitoxins (i.e. not homologous to the 10 families) have 
been predicted by bioinformatics approaches recently and 
some were found to be associated with known toxin or 
antitoxin families, indicating that 'hybrid' systems might 
be more common than originally thought (12,23,24). In 
fact, one such prediction in Escherichia coli K-12 was 
recently validated experimentally (8). Thus, predictions 
are that the type II TA systems are much more 
abundant and diversified than what is currently described. 
To evaluate this diversity, a bioinformatics approach was 
developed to explore prokaryotic genomes and to identify 



novel toxins and antitoxins. From our predictions, 18 
antitoxin and 23 toxin sequences originating from 
different bacterial species were validated experimentally 
in E. coli, significantly increasing the number of families 
of type II antitoxins and toxins. Therefore, we propose to 
refer to antitoxin and toxin families independently instead 
of referring to TA system families. 

MATERIALS AND METHODS 

Bioinformatics approach 

This approach is described in detail in Supplementary
'Materials and Methods' section and summarized in 
Figure 1. 

Bacterial strains, plasmids and media 

Bacterial strains. The following E. coli strains were used: 
MC1061 (F- araD139, Δ(ara-leu)7696, galE15, galK16,
Δ(lac)X74, rpsL (Str^r), hsdR2 (rK- mK+), mcrA mcrB1)
(25), MG1655 (rph1 ilvG rfb-50) (26), DJ624Δara
(MG1655 lacX74 malP::lacIq) (27) and DJ624 λsfiA::lacZ
(this work).

Plasmids. The following plasmids were used; their
relevant characteristics are indicated in parentheses: the pBAD33
vector (p15A, Cm^r, pBAD promoter) (28), the pBAD33-yoeB
plasmid (pBAD33 derivative containing the yoeB
toxin gene under the control of the pBAD promoter)
(29), the pKK223-3 vector (ColE1, Amp^r, pTac
promoter) (30) and the pLac-staby vector
(DelphiGenetics, Belgium).

Media. Luria-Bertani liquid and agar medium (LB)
(Invitrogen) and M9 minimal liquid and agar
medium [KH2PO4 (22 mM), Na2HPO4 (42 mM), NH4Cl
(19 mM), MgSO4 (1 mM), CaCl2 (0.1 mM), NaCl (9 mM)]
supplemented with casamino acids [0.2%, except for
methionine incorporation (0.05%)] and carbon sources
[glycerol, glucose or arabinose (1%)] were used.

Antibiotics. Antibiotics were added at the following concentrations:
chloramphenicol, 20 µg/ml; ampicillin, 100 µg/ml.

Experimental validation of novel toxin and antitoxin
sequences

The experimental validation of the putative novel toxin
and antitoxin families relies on a simple assay: expression
of the putative toxins should cause growth inhibition while 
co-expression with their putative cognate antitoxins 
should restore normal growth. To that end, antitoxin 
and toxin candidate genes were cloned in compatible ex- 
pression vectors under the control of inducible promoters. 

Toxin cloning. Predicted toxin genes were cloned in the
pBAD33 vector, under the control of the arabinose-
inducible pBAD promoter. The predicted toxin coding sequences
(CDSs) were amplified from bacterial genomic
DNA using the appropriate start and stop primers
(Supplementary Table S1). Start primers carry a canonical
Shine-Dalgarno (SD) sequence. Polymerase chain reaction
(PCR) products were digested by XbaI and PstI, except
the sauT3 COL, nspT5 PC7120, speT2 TIGR4 and speT1 TIGR4
amplification products, which were digested by SacI and KpnI,
and ccrT3 CB15, which was digested by XbaI and SalI.
Digested products were ligated into the pBAD33 vector 
cleaved with the appropriate restriction enzymes. Toxins 
that did not confer an unambiguous growth inhibition 
phenotype when expressed from the pBAD33 vector 
were cloned in the pKK223-3 vector under the control 
of the pTac promoter. The PCR products were digested 
by EcoRI and PstI and cloned in the pKK223-3 vector 
cleaved with the same restriction enzymes. 

Antitoxin cloning. Predicted antitoxin genes were cloned
in the pLac-staby vector and/or in the pKK223-3 vector,
which places the antitoxin genes under the control of the
pLac and pTac promoters, respectively. Expression from
both promoters is induced by isopropyl β-D-1-thiogalactopyranoside
(IPTG) addition. Predicted CDSs were
amplified from genomic DNA using the appropriate
start and stop primers (Supplementary Table S1) and
PCR products were ligated in the pLac-staby vector.
The predicted antitoxins that failed to counteract the 
growth inhibition of their cognate toxins were cloned in 
the pKK223-3 vector under pTac control. The PCR 
products were digested by EcoRI and PstI and cloned in 
the pKK223-3 vector cleaved with the same restriction 
enzymes. To test the activity of antitoxins associated 
with toxins cloned in the pKK223-3 vector, the cognate 
antitoxin genes were cloned in the pBAD33 vector. PCR 
products were digested by XbaI and PstI and cloned in the
pBAD33 vector digested with the same restriction enzymes.

Killing/rescue assay. The DJ624Δara strain was transformed
with toxin-encoding plasmids and/or antitoxin-encoding
plasmids and the control vectors. Transformants
were grown overnight in M9 liquid medium containing
glucose (1%) and the appropriate antibiotics. Overnight
cultures were diluted 100-fold in M9 medium containing
glucose (1%) and the appropriate antibiotics and grown at
37°C to an OD600nm of 0.2. Dilutions (10^0 to 10^-6) were
plated on M9 plates containing either glucose (1%) or
arabinose (1%) and IPTG (1 mM) and the appropriate
antibiotics. Plates were incubated overnight at 37°C.
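
The readout of the killing/rescue assay can be expressed numerically. The following Python sketch is illustrative only: the function names, the plated volume and the 0.01 threshold are assumptions introduced here, not values from this work, in which plates were scored qualitatively.

```python
def cfu_per_ml(colonies: int, dilution_exponent: int, plated_volume_ml: float) -> float:
    """Colony-forming units per ml of undiluted culture.

    dilution_exponent is n for a 10^-n dilution (0 means undiluted).
    """
    return colonies * (10 ** dilution_exponent) / plated_volume_ml


def plating_efficiency(cfu_induced: float, cfu_repressed: float) -> float:
    """CFU on inducing plates (arabinose + IPTG) relative to glucose plates."""
    if cfu_repressed <= 0:
        raise ValueError("repressed-condition CFU must be positive")
    return cfu_induced / cfu_repressed


def kill_rescue_call(eff_toxin_alone: float, eff_with_antitoxin: float,
                     threshold: float = 0.01) -> str:
    """Classify a candidate pair; the threshold is a hypothetical cut-off."""
    if eff_toxin_alone >= threshold:
        return "no growth inhibition"
    if eff_with_antitoxin >= threshold:
        return "growth inhibition, rescued by antitoxin"
    return "growth inhibition, not rescued"
```

A candidate toxin would be called toxic when its plating efficiency under induction falls below the chosen threshold, and its partner would be called an antitoxin when co-expression brings the efficiency back above it.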

SOS induction 

The DJ624 λsfiA::lacZ strain containing the pBAD33,
pBAD-parE RK2, pBAD-ccrT4 CB15 or pBAD-atuT1 C58
plasmids was grown in M9 medium supplemented with
glucose (1%) and chloramphenicol to an OD600nm of
0.1. Cultures were centrifuged and pellets resuspended in
M9 medium containing arabinose (1%) and chloramphenicol.
After 120 min of induction, aliquots were taken to
perform β-galactosidase assays as described in ref. 31.
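
For readers unfamiliar with the readout, β-galactosidase activity from a lacZ reporter is commonly reported in Miller units. The sketch below assumes the standard Miller formula; whether ref. 31 uses exactly this formulation is an assumption, and the function and parameter names are illustrative.

```python
def miller_units(od420: float, od550: float, od600: float,
                 reaction_min: float, culture_volume_ml: float) -> float:
    """Standard Miller formula (assumed, not confirmed against ref. 31).

    od420 and od550 are read on the reaction mix, od600 on the culture.
    """
    if reaction_min <= 0 or culture_volume_ml <= 0 or od600 <= 0:
        raise ValueError("time, volume and OD600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (reaction_min * culture_volume_ml * od600)


def fold_induction(sample_units: float, vector_units: float) -> float:
    """SOS induction relative to the empty-vector control (pBAD33)."""
    if vector_units <= 0:
        raise ValueError("control activity must be positive")
    return sample_units / vector_units
```

Expressing each toxin-bearing strain as a fold change over the pBAD33 control makes the comparison between toxins independent of the absolute activity of the reporter.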

35S-methionine incorporation

The DJ624Δara strain containing the pBAD33 vector and
its derivatives containing the experimentally validated toxin
genes and appropriate controls were grown in M9 medium
supplemented with glycerol (1%) at 37°C. Toxin
overexpression was induced at an OD600nm of 0.1 by
adding arabinose (1%). After 180 min of induction, 1 ml
aliquots were taken and 5 µCi of 35S-methionine was
added. After 2 min of incubation at 37°C, 500 µl aliquots
were removed and added to tubes containing 5 ml of cold
10% trichloroacetic acid (TCA) and left for 20 min on ice. Samples were then
filtered on nitrocellulose filters (0.45 µm) saturated with
the non-labeled precursor using a glass funnel. Filters
were then washed in 10% TCA and air-dried. Filter-retained
material was counted in 10 ml of scintillation
liquid (Ready Protein +) in a liquid scintillation counter
(Beckman). Translation rate was normalized to the
35S-methionine incorporated by the control strain
(pBAD33 vector).
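
The normalization step can be written as a one-line calculation. This is a minimal sketch: the optional background subtraction and all names are assumptions added for illustration and are not described in the protocol above.

```python
def relative_incorporation(sample_cpm: float, control_cpm: float,
                           background_cpm: float = 0.0) -> float:
    """35S-methionine incorporation as a percentage of the control strain.

    background_cpm (e.g. a filter-only blank) is a hypothetical correction.
    """
    corrected_control = control_cpm - background_cpm
    if corrected_control <= 0:
        raise ValueError("control counts must exceed background")
    return 100.0 * (sample_cpm - background_cpm) / corrected_control
```

With this convention the pBAD33 control is 100% by construction, and a toxin that blocks translation gives a value well below it.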

RESULTS 

Detecting 'similar' and potentially novel toxins and 
antitoxins 

We collected the antitoxin and toxin sequences of type II 
systems that were experimentally validated and character- 
ized at the time we started this work. In addition to the 
relBE, parDE, vapBC, mazEF, phd-doc, higAB, hipBA, 
omega-epsilon-zeta and ccd systems and their homologs, 
sequences of the vapXD system were added to our data set 
even though this system is poorly documented (32). We 
also included sequences of the parD EDL933 -parE3, 
paaA2-parE2 (33) and yafNO systems that were identified
by our procedure and tested as 'proof of concept'. Note
that yafNO characterization was published during the
course of this work (34). The hicAB system was not 
included since it was not validated at the time we started 
this analysis (8). Our query data set was therefore com- 
posed of 24 sequences of antitoxin and 24 sequences of 
toxin that were denoted as 'original' sequences 
(Supplementary Table S2). We performed a comprehen- 
sive PSI-BLAST search (see Supplementary 'Materials 
and Methods' section) on 2181 prokaryotic genomes and 
detected more than 10000 sequences. These sequences 
were grouped in eight toxin super-families and nine
antitoxin super-families (Supplementary Tables S3 and S4).

Based on the conservation of type II systems organiza- 
tion, we developed an exploratory bioinformatics ap- 
proach based on the 'association by guilt' of orphan 
toxins or antitoxins (i.e. not paired with 'original' anti- 
toxins or toxins) to discover novel sequences potentially 
showing antitoxin or toxin activities (Figure 1 and see 
Supplementary 'Materials and Methods' section for de- 
tails). These sequences were grouped into protein 
families based on their similarities (see Supplementary 
'Materials and Methods' section for the detailed proced- 
ure and parameters). As a result, we distinguished two 
types of predicted antitoxin or toxin sequences: those be- 
longing to a family composed of proteins of unknown 
function which were called AG (for 'associated by 
guilt'), and those 'associated by guilt' but present in a 
family containing at least one 'original' sequence or one 
sequence presenting similarity with 'original' sequences 
(denoted as 'similar', see Supplementary 'Materials and
Methods' section for details). Super-family names of these
'original' and/or 'similar' sequences were transferred to
these AG sequences and they were denoted consequently
'associated by guilt and annotated' (AGA).

Figure 1. Association by guilt bioinformatics approach. Detection of 'associated by guilt' (AG) sequences (A). 'Similar' sequences (purple box) to the
48 'original' sequences (white box) were detected by PSI-BLAST searches using the indicated parameters (gray boxes) and a database composed of
2181 genome sequences (blue cylinder). First round: AG toxins and antitoxins denoted as AG1 (orange box) were identified by pairing of the 'similar'
sequences using the indicated parameters (gray box). Number and type of sequences are indicated (T for toxin, A for antitoxin and U for
unassigned). Second round: AG1 sequences were used as query for a second round of detection using the same parameters as in the initial steps
to identify AG-like sequences (orange box). AG2 sequences were identified by pairing the AG-like sequences. Number and type of sequences are
indicated. Pair definition (B). An additional round of pair definition was performed using sequences detected in (A) using the indicated parameters
(gray box). Protein families (C). Protein families (green box) were generated by grouping all the proteins present in the dataset using the Markov
clustering algorithm (MCL) with the indicated parameters (gray boxes). Names of 'original' or 'similar' sequences were propagated to the AG
proteins within the same protein family. These sequences were then defined as 'associated by guilt and annotated' (AGA).
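
The pairing step of the 'association by guilt' idea can be sketched in a few lines of Python. This is a simplified illustration, not the published procedure: the `Orf` type, the function names and the numeric limits are assumptions. The limits reuse the size and intergenic-distance ranges quoted in the Introduction for known type II systems, whereas the parameters actually used are given in the Supplementary 'Materials and Methods' section.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Illustrative limits borrowed from known type II systems (amino acids, nt);
# these are NOT the parameters of the published pipeline.
SIZE_RANGE = {"antitoxin": (31, 204), "toxin": (41, 206)}
GAP_RANGE = (-20, 30)


@dataclass(frozen=True)
class Orf:
    name: str
    start: int                  # 1-based, inclusive
    end: int                    # inclusive, stop codon included
    strand: str                 # '+' or '-'
    kind: Optional[str] = None  # 'toxin' or 'antitoxin' for a 'similar' hit

    @property
    def aa_length(self) -> int:
        return (self.end - self.start + 1) // 3 - 1


def intergenic_gap(a: Orf, b: Orf) -> int:
    """Distance in nt between two ORFs (negative when they overlap)."""
    left, right = (a, b) if a.start <= b.start else (b, a)
    return right.start - left.end - 1


def partner_kind(kind: str) -> str:
    return "antitoxin" if kind == "toxin" else "toxin"


def associated_by_guilt(orfs: List[Orf]) -> List[Tuple[str, str, str]]:
    """Return (orphan, neighbour, predicted kind) for ORFs of one replicon."""
    ordered = sorted(orfs, key=lambda o: o.start)
    calls = []
    for i, orf in enumerate(ordered):
        if orf.kind is None:
            continue
        wanted = partner_kind(orf.kind)
        # Upstream and downstream neighbours on the same strand, close enough.
        close = [
            ordered[j] for j in (i - 1, i + 1)
            if 0 <= j < len(ordered)
            and ordered[j].strand == orf.strand
            and GAP_RANGE[0] <= intergenic_gap(orf, ordered[j]) <= GAP_RANGE[1]
        ]
        if any(n.kind == wanted for n in close):
            continue  # already paired with a 'similar' partner: not an orphan
        lo, hi = SIZE_RANGE[wanted]
        for n in close:
            if n.kind is None and lo <= n.aa_length <= hi:
                calls.append((orf.name, n.name, wanted))
    return calls
```

In this sketch an orphan 'similar' toxin flags an unannotated, small, closely spaced neighbour as an AG antitoxin candidate, and vice versa; clustering the candidates into families would be a separate step.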

Experimental validation of predicted toxins and antitoxins 

Our bioinformatics exploratory procedure led to the pre- 
diction of a large number of novel AG and AGA toxins 
and antitoxins located upstream and/or downstream of 
orphan 'similar' sequences and originating from numerous bacterial
species. 

Experimental validation of these AG and AGA se- 
quences was carried out in E. coli using a simple Kill/ 
Rescue assay, in which expression of the toxin gene is 
expected to inhibit cell growth while co-expression of the 
toxin and antitoxin genes restores growth. 

The toxin and antitoxin sequences were cloned in com- 
patible vectors, under the control of different inducers (see 
'Materials and Methods' section). These sequences were 
placed in the same genetic context, i.e. RBS and ATG as 
start codon, to achieve a comparable level of expression, 
although we cannot rule out that some genes are less ex- 
pressed than others as we did not measure expression 
levels. 

Predicted toxins 

Twenty-three AG toxins were experimentally tested. Eight
of them are paired with sequences 'similar' to 'original'
antitoxins (Supplementary Table S5). Expression of the
putative BceT5 E33L, SpyT2 10270 and LmoT1 EGD-e toxins
inhibited E. coli growth. Among the upstream open
reading frames (ORFs) predicted to be antitoxins, only
BceA5 E33L, which is associated with the BceT5 E33L
toxin, turned out to be a functional antitoxin belonging
to the HigA super-family of antitoxins. For the SpyT2 10270
toxin, the antitoxin activity of the downstream SpyA2 10270
ORF, although not predicted by our approach, was tested.
Co-expression of SpyA2 10270 relieved SpyT2 10270-mediated
cell growth inhibition. In this case, the antitoxin
gene is located downstream of the toxin gene. For the
third toxin that inhibits E. coli growth, LmoT1 EGD-e,
none of the flanking ORFs showed antitoxin activity.

Seven AG toxins associated with AGA antitoxins belonging
to the HigA super-family were tested (Supplementary
Table S6). Expression of four of them (SmeT1 1021,
MavT1 K10, SpyT1 9429 and SpyT3 10270) inhibited E. coli
growth. However, only two of the predicted antitoxins
(SmeA1 1021 and MavA1 K10) relieved the growth inhibition
mediated by their cognate toxins. For the SpyT1 9429 and
SpyT3 10270 toxins, we were unable to detect antitoxin activity
for their flanking ORFs.

Eight AG toxins associated with eight AG antitoxins 
were experimentally tested (Supplementary Table S7). 
Expression of BceT1 E33L, SpyT1 10270, SpyT1 M1 and
EcoT1 EDL933 resulted in E. coli growth inhibition and
co-expression of their paired antitoxins relieved growth 
inhibition (Figure 2). 

One predicted pair (MGAS10270 Spy0568 and
Spy0569) presented a particular phenotype. This pair is
composed of sequences similar to the E. coli hicAB
system (8). HicA was shown to be a toxin cleaving
mRNAs and HicB was shown to be its antitoxin. Our
prediction was opposite: Spy0568, the S. pyogenes HicB
homolog, was predicted to be a toxin and Spy0569, the
HicA homolog, the antitoxin. Surprisingly, expression of
both proteins was toxic for E. coli (data not shown). We
therefore renamed Spy0568 as SpyT4 10270 and Spy0569 as
SpyT5 10270. The flanking ORFs being located on the complementary
strand, we did not check their antitoxic
activity.

Figure 2. The ecoA1-ecoT1 EDL933 and the nspA5-nspT5 PC7120 pairs
constitute TA systems. The DJ624Δara strain containing the pBAD33
and pLac-staby plasmids (1), the pBAD33-ecoT1 EDL933 and pLac-staby
plasmids (2), the pBAD33-ecoT1 EDL933 and pLac-staby-ecoA1 EDL933
plasmids (3), the pBAD33-nspT5 PC7120 and pLac-staby plasmids (4)
or the pBAD33-nspT5 PC7120 and pLac-staby-nspA5 PC7120 plasmids
(5) were grown in log phase in M9 medium supplemented with
glucose (1%) and appropriate antibiotics. Serial dilutions (as indicated: 10^0, 10^-2, 10^-4 and 10^-6)
were spotted on LB plates containing arabinose (1%) and IPTG
(1 mM). Plates were incubated overnight at 37°C.

Predicted antitoxins 

Eleven AG antitoxins were experimentally tested. Eight 
predicted antitoxins are associated with sequences 'similar' 
to 'original' toxins (Supplementary Table S8). Expression 
of these toxins inhibited E. coli growth except for the 
RPC4130 ORF. The seven cognate antitoxins relieved 
the toxicity mediated by their associated toxins. 

Three of these antitoxins (CcrA1 CB15, SpeA2 TIGR4 and
CcrA4 CB15) were validated by other groups during the
course of this work, although their annotation turned
out to be incorrect (9,35). The CcrA1 CB15 antitoxin is
associated with a sequence similar to ParE and was
described as being a ParD RK2-like antitoxin in
Caulobacter crescentus (36). We were unable to detect
any sequence similarity between this sequence and the
ParD RK2 antitoxin. The second case concerns the
SpeA2 TIGR4 antitoxin, which is associated with a toxin
from the RelE family in Streptococcus pneumoniae. This
ORF was named RelB2Spn in a previous work (35).
However, again, we were unable to detect any sequence
similarity between this sequence and RelB K-12. The last
case concerns the ccrA4 CB15-ccrT4 CB15 system that was
annotated as relBE2 in C. crescentus (9) and experimentally
analyzed (36). The CcrT4 CB15 toxin belongs to the
ParE/RelE super-family. We showed that expression of
this toxin induces the SOS response, although less efficiently
than ParE RK2 (Figure 3), confirming our predictions and
indicating that CcrT4 CB15 is a ParE-like toxin. The
associated CcrA4 CB15 antitoxin, however, does not show
any similarities to 'original' antitoxins. Thus, these three
antitoxins represent novel sequences associated with
'similar' toxins that present a growth inhibition phenotype;
they do not show any similarities with 'original' antitoxins and
were not described previously. The AtuA1 C58 antitoxin is
associated with AtuT1 C58, which belongs to the ParE/RelE
toxin super-family. Expression of this toxin induces the
SOS system, indicating that AtuT1 C58 is a ParE-like
toxin (Figure 3). The NspA5 PC7120 and AfuA2 DSM4304
antitoxins are paired with toxins from the VapC super-family
and the SpeA3 TIGR4 antitoxin is associated with a
toxin belonging to the Doc super-family.

Three AG antitoxins associated with AGA toxins 
were experimentally tested (Supplementary Table S6). 
Expression of these three toxins (NspT1 PC7120,
NspT2 PC7120 and NeuT1 C91) resulted in E. coli growth
inhibition and co-expression of their associated antitoxins 
relieved this inhibition (Figure 2). Two of these toxins 
belong to the VapC super-family and the other to the 
ParE/RelE super-family. 

Thus, this approach led to the validation of 23 toxins 
and 18 antitoxins. For 5 of the 23 toxins, we were unable 
to identify the cognate antitoxin, suggesting that they 
might be solitary toxins or that expression levels of some 
of the antitoxins might be too low to rescue the toxic 
phenotype. Eleven of the 34 predicted toxins did not
affect E. coli growth. As mentioned above, their
expression levels might be too low; alternatively, these sequences
might be false-positives (not related to toxins) or they
might be 'true' toxins but only for their natural host and
not for E. coli.

Toxin activities 

To gain insights into the mode of toxicity of the novel validated
toxins, several tests were performed. Cell morphology
was observed by microscopy and SOS induction,
transcription as well as translation rates were measured
in E. coli overexpressing the active toxins. In addition to
the novel toxins, we also tested the AGA NspT1 PC7120 and
NspT5 PC7120 toxins belonging to the VapC super-family in
order to confirm this annotation. Neither cell morphology,
SOS induction nor transcription rates were
affected by toxin overexpression (data not shown and
Supplementary Figure S1). However, expression of these
toxins drastically reduced 35S-methionine incorporation
(to 10-30% of the level measured with the control vector)
(Figure 4). These data show that all of the tested toxins
affect translation. In addition, preliminary data seem to
indicate that expression of some of these toxins induces
mRNA cleavage in E. coli (data not shown).

Figure 3. Overexpression of the CcrT4 CB15 and AtuT1 C58 toxins induces
the SOS system in E. coli. The DJ624 λsfiA::lacZ strain containing
the pBAD33 (1), pBAD-parE RK2 (2), pBAD-ccrT4 CB15 (3) or pBAD-atuT1 C58
(4) plasmids was grown in M9 medium. After induction
of toxin expression by arabinose addition, samples were taken to
perform β-galactosidase assays as described in 'Materials and
Methods' section.

Evolutionary relationships between novel and 'original' 
super-families 

Six of the 23 toxins validated experimentally define four 
novel super-families (Table 1). These super-families were 
denoted as Gin (for growth inhibition). 

Three of the novel toxins constitute the GinA
super-family (SpyT1 10270, SpyT2 10270 and BceT1 E33L).
The GinB super-family is composed of the SmeT1 1021
novel toxin as well as the YgjN toxin, which was identified
in E. coli during the course of this work (37). This toxin
was considered as a homolog of RelE (37). However, while
we detected sequence similarity between SmeT1 1021 and
YgjN (data not shown, E-value: 10e-7), we were unable
to detect any sequence similarity between SmeT1 1021 and
the members of the ParE/RelE super-family. The GinC
and GinD super-families are composed of SpyT1 M1 and
BceT5 E33L, respectively. Regarding the solitary toxins,
they define four novel super-families (data not shown).

Twelve of the 18 antitoxins that were experimentally
validated define 11 novel antitoxin super-families that
are unrelated to the 'original' super-families. They were
named Fiz for full toxin neutralization (Table 2).

Figure 4. Overexpression of the novel toxins inhibits translation in
E. coli. The DJ624Δara strain containing the pBAD33 (1), pBAD-parE RK2
(2), pBAD-yoeB (3), pBAD-ecoT1 EDL933 (4), pBAD-mavT1 K10
(5), pBAD-spyT1 10270 (6), pBAD-spyT2 10270 (7), pBAD-bceT1 E33L (8),
pBAD-smeT1 1021 (9), pBAD-spyT1 M1 (10), pBAD-bceT5 E33L (11),
pBAD-nspT1 PC7120 (12), pBAD-nspT2 PC7120 (13), pBAD-spyT4 10270
(14), pBAD-spyT5 10270 (15), pBAD-spyT3 10270 (16), pBAD-spyT1 9429
(17) and pBAD-lmoT1 EGD-e (18) plasmids was grown in M9 medium. After
induction of toxin expression by arabinose addition, cultures were labeled
with 35S-methionine. Translation rate was measured as described in
'Materials and Methods' section.

Association between antitoxin and toxin super-families 

In general, a large number of predicted toxin sequences 
are associated with antitoxin AG sequences (Figure 5). 
Three main categories can be defined: toxin super-families 
that have multiple partners (>4) such as the large ParE/ 
RelE, VapC and CcdB/MazF super-families and the 
smaller GinA super-family (Figure 5A). The second 
category is composed of Doc, GinC and Zeta super- 
families. Sequences belonging to these super-families are 
associated with three different antitoxin super-families 
(Figure 5B). In the last category, toxin super-families are 
associated with AG sequences and a second super-family, 
Phd for the YafO super-family and HigA for the HipA, 
GinB and GinD super-families. 

As observed for the predicted toxins, a large number of 
predicted antitoxins are associated with AG sequences 
(Figure 6). Categories similar to those defined for toxin 
super-families can be defined. Three super-families of antitoxin
are promiscuous and associate with five or more
toxin families, i.e. the large Phd and HigA super-families
(Figure 6A). The second category groups super-families
that associate with three or four different toxin
super-families such as the RelB and FizD super-families
(Figure 6B). Most of the antitoxin super-families are quite 
restrictive in their associations (Figure 6C). They are 
associated with AG sequences and a second toxin super- 
family. Five of the novel super-families associate exclu- 
sively with a specific toxin super-family (FizB, FizF, 
FizI, FizJ and FizK) (Figure 6D). 
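
The antitoxin categories described above amount to counting distinct partners per super-family. The sketch below is a hedged illustration: the function names and category labels are introduced here, 'AG' is treated as one partner label, and the thresholds simply restate the text (five or more, three or four, AG plus one super-family, a single partner). It is not the authors' analysis code.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def partner_sets(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each antitoxin super-family to the set of its toxin partners."""
    partners = defaultdict(set)
    for antitoxin_family, toxin_family in pairs:
        partners[antitoxin_family].add(toxin_family)
    return dict(partners)


def antitoxin_category(n_partners: int) -> str:
    """Category labels are illustrative names for the groups in the text."""
    if n_partners >= 5:
        return "promiscuous"
    if n_partners >= 3:
        return "intermediate"
    if n_partners == 2:
        return "restrictive"
    return "exclusive"


def categorize(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    return {family: antitoxin_category(len(partners))
            for family, partners in partner_sets(pairs).items()}
```

The same counting applies to toxin super-families by swapping the two columns of the pair list, with the cut-offs adjusted to those given for toxins (more than four partners, three partners, AG plus one super-family).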

Phyletic and genomic distribution of the toxin and 
antitoxin super-families 

Phyletic distribution varies from one super-family to the 
other. The ParE/RelE, Zeta, VapC, Doc, CcdB/MazF, 
HipA and GinB super-families of toxins are widely dis- 
tributed in bacterial phyla, although GinB is not detected 
in Firmicutes (Figure 7A). At least 70% of their members 
are detected in Proteobacteria, Firmicutes (except GinB), 
Cyanobacteria and Actinobacteria. The remaining se- 
quences are detected in various phyla, with all of them 
except HipA and GinB being present in Archaea. The
GinA and VapD super-families, on the other hand, are 
restricted to a smaller number of phyla. Note that the 
VapD sequences are neither detected in Firmicutes nor 
in Archaea. The GinC, YafO and GinD super-families are specific to a
single phylum. Most of the toxin sequences are located in 
chromosomes (Supplementary Figure S2A), except for the 
VapD sequences that are detected on plasmids (40%) and 
GinA sequences on phages (30%). The GinC, YafO and 
GinD sequences are detected neither on plasmids nor on 
phages. 

As observed for the toxin super-families, phyletic
distribution of the antitoxin super-families varies from one
super-family to the other (Figure 7B). The HigA, FizA,
FizB, FizC, Phd, VapB, PasA and RelB super-families
appear widely distributed in bacterial phyla. At least 70%
of the members of these super-families are found in
Table 2. Occurrence of 20 antitoxin super-families in 2181 prokaryotic genomes

Antitoxin superfamilies (n = 10 829)

Super-family headings: Phd, RelB, PasA, VapB, VapX, CcdA; (n = 304)

'Original' sequences: RelB K12, VapX, CcdA F, ParD RK2, Axe, PasB, DinJ, ParD EDL933, MazE, StbD, YafN, Paal, ChpBI, RelB 307, PemI

Validated 'similar' sequences: NT, NT, NT, NT, YgjM a, NT, NT, NT, NT; BceA5 E33L

Validated AGA sequences: NT, NT, NT, NT, SmeA1 1021, NT, NT, NT, NT; MvA1 K10

Validated AG sequences: NT, NT, SpeA2 TIGR4, SpeA3 TIGR4, EcoA1 EDL933, NT, NT, NT, NT, SpyA2 1070

Antitoxin superfamilies (n = 10 829)

FizB        FizC        FizD       FizE      FizF      FizG     FizH     FizI     FizJ     FizK     Total
(n = 1542)  (n = 1121)  (n = 103)  (n = 26)  (n = 10)  (n = 9)  (n = 8)  (n = 6)  (n = 5)  (n = 1)  20

'Original' sequences: total 24

Validated 'similar' sequences: -; total 1

Validated AGA sequences: -; total 2

Validated AG sequences: CcrA1 CB15, SpyA1 M1, NeuA1 C91, AtuA1 C58, SpyA1 10270, NspA2 PCC7120, NspA5 PCC7120, AfuA2 DSM4304, CcrA4 CB15, BceA1 33L, NspA1 PCC7120; total 15

The HigA super-family might contain a significant number of false-positive sequences that were selected on the basis of the HTH-XRE domain found
in the 'original' HigA sequence.

n represents the number of sequences; NT, not tested.

a Indicates that the YgjM antitoxin was validated during the course of this work (37).



Proteobacteria, Firmicutes and Cyanobacteria (except for
RelB sequences). The remaining 30% varies depending on
the family, with members of the HigA, FizA, Phd, VapB and
PasA super-families being detected in Archaea. The other
super-families appear to be much more restricted. A
majority of the FizG sequences are detected in
Cyanobacteria. Sequences of the FizE and FizF super-families
are distributed in two phyla: Proteobacteria and Green
sulfur bacteria. Members of the CcdA, ParD and FizI
super-families are exclusively found in Proteobacteria,
while those of VapX and Epsilon are restricted to
Firmicutes and those of FizH to Archaea. The only
representative of the FizK super-family is found in
Cyanobacteria.

Regarding their genomic location (Supplementary 
Figure S2B), more than 85% of the sequences are 
located on chromosomes except for the RelB, FizF, 
CcdA and ParD super-families with 20-40% of the se- 
quences being located on plasmids. The Epsilon super- 
family represents an exception with its 11 representative 
sequences being found only on plasmids. 

Genome size and number of type II TA systems are not 
correlated 

Figure 8 shows that there is no significant correlation
between the number of predicted toxin and antitoxin
sequences and the total number of ORFs predicted in a
given genome. Even the smallest genomes of some of the
obligate intracellular species seem to contain predicted
toxin and antitoxin sequences (Table 3). While Buchnera
and Chlamydia/Chlamydophila genomes are devoid of
predicted toxin and antitoxin sequences, some
Rickettsia strains (such as R. bellii OSU 85-389, R. bellii
RML369-C, R. akari or R. felis) contain more than 20
predicted sequences, representing between 1.8% and
2.6% of the total number of ORFs. Interestingly, the
number of predicted toxins and antitoxins in the
Rickettsia genus is quite variable (0-36). Variability is
also found within the Wolbachia genus (0-6 predicted
sequences). For the 20 bacterial isolates with the highest
content of predicted toxin and antitoxin sequences, the
number varies from 37 to 97 per genome, representing from
0.8% to 2.5% of the total number of ORFs
(Supplementary Table S9). These bacteria belong to four
different phyla (Proteobacteria, Cyanobacteria, Green
sulfur bacteria and Actinobacteria). High genome plasticity
appears to be a common theme for most of these strains
(described for 12 out of 20, e.g. Microcystis aeruginosa,
Gloeobacter violaceus and Nitrosomonas europaea). They
also tend to have a versatile metabolism (6 out of 20, e.g.
Nitrosomonas europaea, Rhodopseudomonas palustris and
Azoarcus sp.) or a particular metabolism (4 out of 20, e.g.
Geobacter uraniireducens, Pelodictyon phaeoclathratiforme
and Caulobacter sp.). Two isolates are symbionts
(Verminephrobacter eiseniae and Photorhabdus luminescens)
and another is part of phototrophic bacterial consortia
(C. chlorochromatii). Five of the strains belong to the
Figure 5. Associations of toxin super-families. The 12 toxin
super-families are indicated above the pie chart. Each section of the 
pie chart represents the relative abundance of antitoxin sequences be- 
longing to a given super-family associated with toxin sequences. Each 
antitoxin super-family is represented by a specific color. In (A), toxin 
super-families that are associated with multiple antitoxin super-families 
(>4). In (B), toxin super-families that are associated with three different 
antitoxin super-families. In (C), toxin super-families that are associated 
with AG sequences and another antitoxin super-family. Associations 
occurring at less than 1% were not considered for clarity. 



Mycobacterium tuberculosis complex and are pathogenic 
(or deriving from a pathogenic strain in the case of 
M. tuberculosis H37Ra). 
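
One way to examine a relationship such as the one plotted in Figure 8 is to compute a correlation coefficient between the two counts across genomes. The text does not state which statistic was used, so the Pearson coefficient below is an assumption, as is the function name `pearson_r`. The input values are made up for illustration and are not data from this study.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy values (hypothetical): CDS per genome and predicted TA sequences per genome.
toy_cds = [500, 1200, 2500, 4000, 6000]
toy_ta = [0, 30, 5, 60, 12]
r = pearson_r(toy_cds, toy_ta)
print(f"Pearson r = {r:.2f}")
```

A coefficient close to 0 over the full genome set would be consistent with the absence of correlation described above; a rank-based statistic could be substituted if the counts are strongly skewed.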



DISCUSSION 

Discovering novel toxin and antitoxin sequences 

The 'guilt by association' approach was already used to
predict novel TA systems by Makarova et al. (24). The
approach we used in this work allowed the identification
of sequence pairs presenting unusual features such as a
total absence of similarity with the 'original' sequences,
a larger size for both the antitoxins and toxins (>200
amino acids) and larger intergenic distances (up to
300 bp). Comparison between their data and ours
indicates that our approach predicted their sequences
(data not shown) but, more importantly, confirmed that
many more toxin and antitoxin families than previously
thought are either unrelated to the 'original' ones or
simply not yet characterized, and remain to be studied
experimentally. The successful experimental validation of
some of our predictions directly supports the results
presented in the two works.
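
The 'guilt by association' idea can be illustrated with a short Python sketch: a gene lying next to a known toxin or antitoxin gene becomes a candidate partner. This is not the pipeline used in this work. The `Gene` record, the function name `guilt_by_association`, the same-strand requirement and the use of 300 bp (the largest intergenic distance mentioned above) as a cut-off are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def guilt_by_association(genes: List[Gene], known: Set[str],
                         max_gap: int = 300) -> List[Tuple[str, str]]:
    """Return (known gene, candidate partner) pairs among adjacent genes.

    A pair is reported when both genes are on the same strand, the intergenic
    distance is at most max_gap bp (overlaps give a negative distance), and
    exactly one of the two genes is already known.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    hits = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand != right.strand:
            continue
        gap = right.start - left.end - 1
        if gap > max_gap:
            continue
        if (left.name in known) != (right.name in known):
            if left.name in known:
                hits.append((left.name, right.name))
            else:
                hits.append((right.name, left.name))
    return hits


# Hypothetical gene coordinates, for illustration only.
example_genes = [
    Gene("orfA", 100, 400, "+"),
    Gene("relE_like", 420, 700, "+"),
    Gene("orfB", 1500, 1800, "+"),
    Gene("orfC", 1810, 2000, "-"),
]
print(guilt_by_association(example_genes, {"relE_like"}))
```

In this toy layout only `orfA` is reported, since `orfB` lies too far from the known gene and `orfC` is on the opposite strand.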

Activities of type II toxins 

Quite unexpectedly, the novel toxins identified in this
work are all translation inhibitors. We cannot exclude
that a bias might have been introduced by the experimental
validation of foreign genes in E. coli. Toxins targeting
specific pathways might not be toxic for E. coli and would
therefore not be validated by our killing/rescue assay. The
counter-argument is that if these genes are involved in
the stable maintenance of mobile genetic elements and move
through horizontal transfer, toxins should target
conserved mechanisms to ensure their function and
maintenance in different bacterial hosts. 'Original' toxins
inhibiting translation use a variety of mechanisms,
including cleavage of free RNAs, cleavage of mRNAs during
translation, and inhibition of elongation by 30S binding or
by EF-Tu phosphorylation (38-40), and it remains to be
shown whether novel toxins will extend this list of
activities. The mechanisms of action of these toxins are
currently being investigated in our laboratory.

Origin and evolution of type II TA systems 

Evolution of the TA systems remains largely unknown.
While some authors proposed that these systems might
derive from a common ancestor (41), other groups are in
favor of several ancestors (10,12). Our data indicate that
the validated toxins (both 'original' and those discovered
in this analysis) constitute 12 unrelated super-families. No
structural homologs could be detected for the novel toxins
in the PDBsum structure database except for SpyT1 M1,
and the structure of its homolog is unrelated to type II
toxins. This indicates that the novel toxins do not share
any evolutionary relationship with 'original'
super-families, although they are functionally related.
However, only the resolution of their three-dimensional
structures will unambiguously answer this question since
similar structures have been detected for toxins that do
not show significant sequence similarity (42). Subsequent
divergence is observed for toxins within the ParE/RelE
and CcdB/MazF super-families as they acquired different
activities (DNA-gyrase inhibitors and mRNA interferases).

For the antitoxins, our data indicate that they form 20
unrelated super-families. Further analyses will be required
to define whether or not these antitoxins have a
DNA-binding domain. Retracing antitoxin evolution
might be complex since antitoxins are in general composed
of two independent domains, an amino-terminal
DNA-binding domain and a carboxy-terminal
toxin-binding domain. DNA-binding domains might be
recruited from or by other types of activities totally
unrelated to antitoxins. A variation on the theme is observed in






[Figure 6: pie charts, panels A to D. Legend entries: AG, Doc, CcdB/MazF, HipA, ParE/RelE, VapC, GinA, VapD, GinC, Zeta, YafO, GinD.]


Figure 6. Associations of antitoxin super-families. The 20 antitoxin super-families are indicated above the pie chart. Each section of the pie chart 
represents the relative abundance of toxin sequences belonging to a given super-family associated with antitoxin sequences. Each toxin super-family is 
represented by a specific color. In (A), antitoxin super-families that are associated with multiple toxin super-families (five or more). In (B), antitoxin 
super-families that are associated with three or four different toxin super-families. In (C), antitoxin super-families that are associated with AG 
sequences and another toxin super-family. In (D), antitoxin super-families that are associated with a specific toxin super-family or to AG sequences. 
Associations occurring at <1% were not considered for clarity. 
the case of three-component systems. In these systems
[epsilon-omega-zeta (43) and paaR1-paaA1-parE3 (33)],
DNA-binding and antitoxin activities are encoded by
two separate polypeptides. The molecular mechanisms
underlying transcriptional regulation of three-component
systems remain to be determined.



Figure 8. No correlation between the total number of CDS and the
number of predicted antitoxin and toxin sequences (horizontal axis:
total number of CDS). Only 'original' and 'similar' antitoxin and
toxin sequences were considered.



Abundance and diversity 

Another level of diversity is observed for antitoxin
and toxin associations. Sequences of the RelE/ParE,
CcdB/MazF and VapC super-families are paired with
antitoxin sequences of the RelB, Phd, MazE and PasA
super-families, supporting the idea that antitoxins and
toxins were assembled multiple times as suggested
previously (12). Sequences belonging to these super-families
are abundant and present on plasmids. Other super-families
are less promiscuous in their associations. The reason is
unclear, although super-families such as GinC, GinD,
YafO, FizH or FizI are less represented and tend to be
phylum- and chromosome-specific (although we cannot
exclude that some are located on genomic islands), which
might reduce their possibilities of association. However,
although CcdA, Epsilon and ParD antitoxin sequences are
present on plasmids, and therefore should be prone to
horizontal gene transfer, they are phylum-specific and not
abundant. Thus, what determines the evolutionary success
of a given antitoxin or toxin super-family is still largely
unknown. We have to keep in mind the biases introduced
by the fact that nearly half of the fully sequenced genomes
available at NCBI are of Proteobacteria (745/1566
genomes) and that genomic islands within chromosomes
remain difficult to identify.

Nevertheless, in general, these entities are abundant in
bacterial chromosomes. Some genomes contain as many
as 97 predicted antitoxin and toxin sequences
(Supplementary Table S9), representing 1.5% of the
total number of CDSs. An even higher percentage is
observed in some obligate intracellular bacterial
genomes (Table 3). Predicted antitoxin and toxin
sequences in R. bellii and R. felis represent as much as
2.2% and 2.6% of the CDSs, respectively. Interestingly, these strains


Table 3. Predicted antitoxin and toxin sequences in intracellular
obligate species

Bacterial species                                   Phyla           CDS in genome  Predicted A and T (%)
Buchnera aphidicola str. APS                        Proteobacteria  564            0
Buchnera aphidicola str. Bp                         Proteobacteria  504            0
Buchnera aphidicola str. Cc                         Proteobacteria  357            0
Buchnera aphidicola str. Sg                         Proteobacteria  546            0
Chlamydia muridarum Nigg                            Chlamydiae      404            0
Chlamydia trachomatis 434/Bu                        Chlamydiae      874            0
Chlamydia trachomatis A/HAR-13                      Chlamydiae      911            0
Chlamydia trachomatis D/UW-3/CX                     Chlamydiae      895            0
Chlamydia trachomatis L2b/UCH-1/proctitis           Chlamydiae      874            0
Chlamydophila abortus S26/3                         Chlamydiae      932            0
Chlamydophila caviae GPIC                           Chlamydiae      998            0
Chlamydophila felis Fe/C-56                         Chlamydiae      1005           0
Chlamydophila pneumoniae AR39                       Chlamydiae      1112           0
Chlamydophila pneumoniae CWL029                     Chlamydiae      1052           0
Chlamydophila pneumoniae J138                       Chlamydiae      1069           0
Chlamydophila pneumoniae TW-183                     Chlamydiae      1113           0
Rickettsia akari str. Hartford                      Proteobacteria  1259           23 (1.8)
Rickettsia bellii OSU 85-389                        Proteobacteria  1476           32 (2.2)
Rickettsia bellii RML369-C                          Proteobacteria  1429           26 (1.8)
Rickettsia canadensis str. McKiel                   Proteobacteria  1093           1 (0.09)
Rickettsia conorii str. Malish 7                    Proteobacteria  1374           13 (0.9)
Rickettsia felis URRWXCal2                          Proteobacteria  1400           36 (2.6)
Rickettsia massiliae MTU5                           Proteobacteria  968            16 (1.7)
Rickettsia prowazekii str. Madrid E                 Proteobacteria  835            0
Rickettsia rickettsii str. 'Sheila Smith'           Proteobacteria  1345           14 (1.0)
Rickettsia rickettsii str. Iowa                     Proteobacteria  1384           13 (0.9)
Rickettsia typhi str. Wilmington                    Proteobacteria  838            0
Mycobacterium leprae TN                             Actinobacteria  1605           1 (0.06)
Wigglesworthia glossinidia                          Proteobacteria  611            0
Wolbachia endosymbiont of Drosophila melanogaster   Proteobacteria  1195           6 (0.5)
Wolbachia endosymbiont strain TRS of Brugia malayi  Proteobacteria  805            0
Wolbachia pipientis                                 Proteobacteria  1275           2 (0.2)

For each genome, the total number of CDS is indicated.
The total number of 'similar' and AGA sequences belonging to the
super-families described in Supplementary Tables S2 and S3 were
considered with the exception of those belonging to the Zeta and HigA
super-families as many false-positive sequences are suspected.
The percentage of predicted sequences with respect to the total number
of CDS in the genome is indicated between brackets.
A for antitoxin, T for toxin.



also contain a high number of repeat sequences (3.7%
and 4.4%, respectively) and other genes indicative of
genome plasticity (e.g. phage-related genes, transposases)
(44,45). In contrast, other strains of the Rickettsia genus,
such as R. typhi and R. prowazekii, appear to be devoid of
predicted antitoxin and toxin sequences (Table 3).
Interestingly, both strains are closely related in the
Rickettsia phylogeny and have undergone massive
reductive evolution (a general phenomenon observed in
obligate intracellular bacteria) as compared to R. felis and
R. bellii (46). The absence of TA systems in some obligate
intracellular bacteria might be due to massive gene loss. The
comparison of M. leprae with its free-living relatives, the
M. tuberculosis complex strains, supports this idea (one
predicted sequence versus 43-45) (Table 3 and
Supplementary Table S9). Conversely, a high content of
TA systems might reflect an intense gene flux
(Supplementary Table S9). Interestingly, R. felis carries
a putative conjugative plasmid (44) and might still be
prone to horizontal gene transfer. This could possibly
explain the presence of predicted toxin and antitoxin
sequences in this strain.
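
The percentages quoted from Table 3 are the number of predicted antitoxin and toxin sequences divided by the number of CDS in the genome. The Python sketch below reproduces that arithmetic; the function name `percent_of_cds` is a hypothetical helper, and the input values are the counts reported for three Rickettsia genomes in Table 3.

```python
def percent_of_cds(predicted: int, total_cds: int) -> float:
    """Predicted antitoxin and toxin sequences as a percentage of all CDS."""
    if total_cds <= 0:
        raise ValueError("total_cds must be positive")
    return 100.0 * predicted / total_cds


# (predicted A and T sequences, CDS in genome), as listed in Table 3.
table3_rows = {
    "Rickettsia akari str. Hartford": (23, 1259),
    "Rickettsia bellii OSU 85-389": (32, 1476),
    "Rickettsia felis URRWXCal2": (36, 1400),
}

for name, (predicted, cds) in table3_rows.items():
    print(f"{name}: {percent_of_cds(predicted, cds):.1f}%")
```

Rounded to one decimal, the three values are 1.8%, 2.2% and 2.6%, in agreement with the bracketed percentages of Table 3.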

In conclusion, we propose that toxin and antitoxin
genes have emerged and assembled multiple times during
evolution. Associated with mobile genetic elements, TA
systems promote their stability and propagate efficiently
within the bacterial world. Because of their addictive
properties, they integrate stably in bacterial chromosomes
without necessarily providing selective advantages to their
hosts. Interestingly, most of the toxin families are
translation inhibitors, suggesting that this activity might
have been selected rather than more detrimental ones such
as DNA-gyrase inhibition.



SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online. 